25/03/2021

Updates

Welcome to Zoom

Online lecture mode

  • All lectures via Zoom, same time/day as usual.
    • Lectures are recorded (recordings available for 30 days).
    • Preferably no breaks, max. 90 minutes straight.
  • Materials online, as usual.
  • Last two sessions (7 May, 14 May): Q&A session in Zoom instead of presentations/discussion in classroom.

Online examination mode

  • Part I: take-home exercises: No changes. To be handed out on 7 May, to be handed in on 8 June, 16:00.
  • Part II: project presentations: presentations recorded as ‘screencast’ (voice-over-slides).
    • Basically still the same requirements: use Rmd to create slides, presentations of 6-7 minutes max., etc. The only difference is how you deliver your presentation.
    • See here for tips on how to make a screencast.
    • Hand in your presentations by 14 May 2020, 23:59.
    • See assignment in StudyNet/Canvas.

Recap Week 4

Bindings basics

  • Objects/values do not have names but names have values!
  • Objects have a ‘memory address’/identifier.
x <- c(1, 2, 3)

Copy-on-modify

  • If we modify values in a vector, actual ‘copying’ is necessary (depending on the data structure of the object…).
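
A minimal sketch of the two points above, using base R only. The variable names are illustrative; `tracemem()` is only called if the R build supports memory profiling.

```r
# Two names bound to the same object: nothing is copied at this point.
x <- c(1, 2, 3)
y <- x

# If memory profiling is available, R reports when the object gets copied.
can_trace <- capabilities("profmem")
if (can_trace) tracemem(x)

# Modifying the vector via 'y' triggers a copy; 'x' keeps its values.
y[2] <- 10

if (can_trace) untracemem(x)

print(x)
print(y)
```

After running this, `x` still holds the original values, while `y` points to a modified copy.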

Data structures and modify-in-place

Improving performance

  • Bottleneck(s) identified, what now?
  • See previous examples for typical problems in a data analytics context.
  • Vast variety of potential bottlenecks. Hard to give general advice.

Programming with Big Data

  1. Which basic (already implemented) R functions are more or less suitable as building blocks for the program?
  2. How can we exploit/avoid some of R’s lower-level characteristics in order to implement efficient functions?
  3. Is there a need to interface with a lower-level programming language in order to speed up the code? (advanced topic)
  • Independent of how we write a statistical procedure in R (or in any other language, for that matter): is there an alternative statistical procedure/algorithm that is faster but delivers approximately the same result?

Issues to keep in mind

  • Vectorization.
  • Memory: avoid copying, pre-allocate memory.
  • Use built-in primitive (C) functions (caution: not always faster if the aim is precision).
  • Existing solutions: load additional packages (read.csv() vs. data.table::fread()).
    • Focus of what follows in this course (approach taken in Walkowiak (2016)).
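
To illustrate the first two points (vectorization, pre-allocation), here is a small sketch in base R. The function names `grow()`, `prealloc()`, and `vectorized()` are hypothetical helpers made up for this comparison; all three compute the same result.

```r
# (a) Growing a vector inside a loop: each iteration builds a new, longer vector.
grow <- function(n) {
  res <- c()
  for (i in seq_len(n)) res <- c(res, sqrt(i))
  res
}

# (b) Pre-allocating the result vector: memory is reserved once, then filled.
prealloc <- function(n) {
  res <- numeric(n)
  for (i in seq_len(n)) res[i] <- sqrt(i)
  res
}

# (c) Vectorized: the loop runs in compiled code inside sqrt().
vectorized <- function(n) sqrt(seq_len(n))

n <- 1e4
same_ab <- isTRUE(all.equal(grow(n), prealloc(n)))
same_bc <- isTRUE(all.equal(prealloc(n), vectorized(n)))
```

Wrapping each call in `system.time()` typically shows (a) as the slowest and (c) as the fastest variant, but the exact timings depend on the machine and on `n`.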

Procedural view and further reading

Goals for today

  1. Know basic strategies for out-of-memory operations in R.
  2. Know basic tools for local big data cleaning and transformation in R.
  3. Understand (in simple terms) how these tools work.
  4. (Recap of virtual memory concept)

Virtual Memory

Virtual memory

  • The operating system allocates part of the mass storage device (hard disk) as virtual memory.
  • If a process/application uses up too much RAM, the OS starts swapping data between RAM and virtual memory.
  • Processes slow down due to swapping.
  • Default (OS) usage of virtual memory concept is not necessarily optimized for data analysis tasks.

Virtual memory: example (Linux)

‘Out-of-memory’ strategies

  • Use virtual memory idea for specific data analytics tasks.
  • Two approaches:
    • Chunked data files on disk: partition large data set, map and store chunks of raw data on disk. Keep mapping in RAM. (ff-package)
    • Memory mapped files and shared memory: virtual memory is explicitly allocated for one or several specific data analytics tasks (different processes can access the same memory segment). (bigmemory-package)
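
The chunking idea can be sketched with base R alone: read a file piece by piece and only keep a running summary in RAM. This is a simplified illustration under stated assumptions, not how the ff package is implemented; the temporary file, the column names, and `chunk_size` are made up for the example.

```r
# Create a small example CSV on disk (stand-in for a large data set).
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, value = (1:1000) / 10),
          tmp, row.names = FALSE)

# Open a connection and read the header line once.
con <- file(tmp, open = "r")
header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

chunk_size <- 300   # number of rows held in RAM at any time
total <- 0          # running sum of 'value'
n_rows <- 0         # running row count

repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break            # end of file reached
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  total <- total + sum(chunk$value)
  n_rows <- n_rows + nrow(chunk)
}
close(con)

chunk_mean <- total / n_rows
```

Only one chunk of `chunk_size` rows is in memory at a time, yet `chunk_mean` equals the mean computed on the full data set.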

Chunking data with the ff-package

Preparations

# SET UP --------------

# install.packages(c("ff", "ffbase", "pryr"))
# load packages
library(ff)
library(ffbase)
library(pryr)

# create directory for ff chunks, and assign directory to ff 
dir.create("ffdf", showWarnings = FALSE)
options(fftempdir = "ffdf")

Chunking data with the ff-package

Import data, inspect change in RAM.

##             used  (Mb) gc trigger   (Mb)  max used   (Mb)
## Ncells   1393114  74.5    2150240  114.9   2150240  114.9
## Vcells 122509739 934.7  213343868 1627.7 211038278 1610.1
mem_change(
flights <- 
     read.table.ffdf(file="../data/flights.csv",
                     sep=",",
                     VERBOSE=TRUE,
                     header=TRUE,
                     next.rows=100000,
                     colClasses=NA)
)
## read.table.ffdf 1..100000 (100000)  csv-read=0.398sec ffdf-write=0.053sec
## read.table.ffdf 100001..200000 (100000)  csv-read=0.386sec ffdf-write=0.035sec
## read.table.ffdf 200001..300000 (100000)  csv-read=0.403sec ffdf-write=0.042sec
## read.table.ffdf 300001..336776 (36776)  csv-read=0.161sec ffdf-write=0.019sec
##  csv-read=1.348sec  ffdf-write=0.149sec  TOTAL=1.497sec
## -31.6 MB

Chunking data with the ff-package

Inspect file chunks on disk and data structure in R environment.

# show the files in the directory keeping the chunks
list.files("ffdf")
##   [1] "clone1664b7fbd953f.ff" "clone1664b9b8cca9.ff"  "clone1e7014c0a1cd8.ff"
##   [4] "clone1e7015a4f712e.ff" "clone2aea22211d9e1.ff" "clone2aea2360c6703.ff"
##   [7] "clone2aea2566ab42d.ff" "clone2aea25e1c1f75.ff" "clone2d49618dbfbf6.ff"
##  [10] "clone2d4965ee3349a.ff" "clone2d49664b07745.ff" "clone2d49672b82b88.ff"
##  [13] "clone308112a4ca401.ff" "clone308113d044b7c.ff" "clone308113d22fb5f.ff"
##  [16] "clone3081149714ed4.ff" "clone399cd5627eb1f.ff" "clone399cd72c6506d.ff"
##  [19] "clone399cd78f6c4e6.ff" "clone399cd8b1f075.ff"  "clone3c3ef1e38eca1.ff"
##  [22] "clone3c3ef4ac46441.ff" "clone3c3ef514956e9.ff" "clone3c3efcb5fb24.ff" 
##  [25] "clone3f8e9146bba78.ff" "clone3f8e94d9633f0.ff" "clone3f8e9506fcdf1.ff"
##  [28] "clone3f8e9b9a7b8.ff"   "clone432452f2dbbc3.ff" "clone4324578aefe51.ff"
##  [31] "clone4324579d1ad52.ff" "clone432457a4743f9.ff" "ff1664b222c38f0.ff"   
##  [34] "ff1664b4d23ee78.ff"    "ff1664b4d7f1e3e.ff"    "ff1e7011754e092.ff"   
##  [37] "ff1e7011a76d5a6.ff"    "ff1e7017084631e.ff"    "ff2aea22c3703b9.ff"   
##  [40] "ff2aea2664ee33.ff"     "ff2aea26b164ce7.ff"    "ff2d49627cd458.ff"    
##  [43] "ff2d49631ca5a34.ff"    "ff2d4964237cc21.ff"    "ff30811207897c4.ff"   
##  [46] "ff308115699c1a.ff"     "ff30811b3430e.ff"      "ff399cd1a2ffc0e.ff"   
##  [49] "ff399cd1e963877.ff"    "ff399cd5477c29.ff"     "ff3c3ef17300293.ff"   
##  [52] "ff3c3ef2229a09b.ff"    "ff3c3ef765d6fb7.ff"    "ff3f8e9435a0018.ff"   
##  [55] "ff3f8e9615e63d.ff"     "ff3f8e96a58ed50.ff"    "ff432453e724ac7.ff"   
##  [58] "ff432454dd9f249.ff"    "ff432455fb42793.ff"    "ffdf1664b11f63957.ff" 
##  [61] "ffdf1664b12d379a7.ff"  "ffdf1664b16a5a516.ff"  "ffdf1664b16cc8da8.ff" 
##  [64] "ffdf1664b16d72904.ff"  "ffdf1664b178a08cd.ff"  "ffdf1664b17c18654.ff" 
##  [67] "ffdf1664b23c24fe0.ff"  "ffdf1664b24c84c7c.ff"  "ffdf1664b25a0763c.ff" 
##  [70] "ffdf1664b2663b0d0.ff"  "ffdf1664b283091fe.ff"  "ffdf1664b2b44b5a4.ff" 
##  [73] "ffdf1664b2c262d7d.ff"  "ffdf1664b2cd1a62.ff"   "ffdf1664b2e3a4c4e.ff" 
##  [76] "ffdf1664b2ed055dd.ff"  "ffdf1664b3054a5d7.ff"  "ffdf1664b3078f7ce.ff" 
##  [79] "ffdf1664b310e3e80.ff"  "ffdf1664b334c94f4.ff"  "ffdf1664b34d11719.ff" 
##  [82] "ffdf1664b388c4b44.ff"  "ffdf1664b39224dcd.ff"  "ffdf1664b39a5a878.ff" 
##  [85] "ffdf1664b39c6c3b6.ff"  "ffdf1664b3af201e8.ff"  "ffdf1664b3b734f1b.ff" 
##  [88] "ffdf1664b3c340dc7.ff"  "ffdf1664b3da147d5.ff"  "ffdf1664b3dd310eb.ff" 
##  [91] "ffdf1664b3ea58fa8.ff"  "ffdf1664b40a69eb6.ff"  "ffdf1664b40aa6c4e.ff" 
##  [94] "ffdf1664b438a096c.ff"  "ffdf1664b44f516e4.ff"  "ffdf1664b4805a867.ff" 
##  [97] "ffdf1664b49ebec96.ff"  "ffdf1664b4a28a649.ff"  "ffdf1664b4ae45ee8.ff" 
## [100] "ffdf1664b4d469edf.ff"  "ffdf1664b4f073478.ff"  "ffdf1664b4f225206.ff" 
## [103] "ffdf1664b504b5000.ff"  "ffdf1664b531499e6.ff"  "ffdf1664b542fe096.ff" 
## [106] "ffdf1664b549b05d6.ff"  "ffdf1664b5795f4c9.ff"  "ffdf1664b58ba26cb.ff" 
## [109] "ffdf1664b5b588fab.ff"  "ffdf1664b5bba6e30.ff"  "ffdf1664b5be03fcf.ff" 
## [112] "ffdf1664b5c6b41d2.ff"  "ffdf1664b5dbcaae3.ff"  "ffdf1664b6443be89.ff" 
## [115] "ffdf1664b64afc395.ff"  "ffdf1664b656e654e.ff"  "ffdf1664b65e821c7.ff" 
## [118] "ffdf1664b68051f3.ff"   "ffdf1664b682d3c68.ff"  "ffdf1664b68dc5ac9.ff" 
## [121] "ffdf1664b69265251.ff"  "ffdf1664b6c5d23e1.ff"  "ffdf1664b70b69393.ff" 
## [124] "ffdf1664b70dab41.ff"   "ffdf1664b749659de.ff"  "ffdf1664b76055d56.ff" 
## [127] "ffdf1664b7679060b.ff"  "ffdf1664b793471bd.ff"  "ffdf1664b799c228f.ff" 
## [130] "ffdf1664b7a45795.ff"   "ffdf1664b7a9c6cf4.ff"  "ffdf1664b7c83bd2f.ff" 
## [133] "ffdf1664b7f9ba27b.ff"  "ffdf1664bb4bf99c.ff"   "ffdf1664bc7e093a.ff"  
## [136] "ffdf1664bd3e0371.ff"   "ffdf1664bfac865c.ff"   "ffdf1ace210b3793b.ff" 
## [139] "ffdf1ace2152aa26c.ff"  "ffdf1ace21690c3a4.ff"  "ffdf1ace21cf46f6a.ff" 
## [142] "ffdf1ace22037e706.ff"  "ffdf1ace227c71e57.ff"  "ffdf1ace22a1bec13.ff" 
## [145] "ffdf1ace22d643bed.ff"  "ffdf1ace249c4aee4.ff"  "ffdf1ace25ea84ebf.ff" 
## [148] "ffdf1ace25f0ff85a.ff"  "ffdf1ace260487a3d.ff"  "ffdf1ace260f82c73.ff" 
## [151] "ffdf1ace264287025.ff"  "ffdf1ace2713ff6e3.ff"  "ffdf1ace296fe013.ff"  
## [154] "ffdf1ace29d0dcd.ff"    "ffdf1ace2bb0a2cc.ff"   "ffdf1ace2e4274e3.ff"  
## [157] "ffdf1c45014d3a58b.ff"  "ffdf1c45016fface9.ff"  "ffdf1c4501c1e8f51.ff" 
## [160] "ffdf1c450283e6c31.ff"  "ffdf1c450332f29d8.ff"  "ffdf1c45036686e35.ff" 
## [163] "ffdf1c4503a24d5d6.ff"  "ffdf1c4503d4fde96.ff"  "ffdf1c45040612509.ff" 
## [166] "ffdf1c4504f84eb98.ff"  "ffdf1c4505346df0e.ff"  "ffdf1c450543a53b5.ff" 
## [169] "ffdf1c4506ff73ad0.ff"  "ffdf1c4507014e9ab.ff"  "ffdf1c45071639c64.ff" 
## [172] "ffdf1c4507932b10a.ff"  "ffdf1c4507a88a181.ff"  "ffdf1c4507d58d596.ff" 
## [175] "ffdf1c450f24806.ff"    "ffdf1e70111ed2bfd.ff"  "ffdf1e701134419ba.ff" 
## [178] "ffdf1e70113a93127.ff"  "ffdf1e7011454351.ff"   "ffdf1e701149cfd8f.ff" 
## [181] "ffdf1e701163ec7ca.ff"  "ffdf1e70118df6828.ff"  "ffdf1e70119d9c8f2.ff" 
## [184] "ffdf1e7011ab2cb13.ff"  "ffdf1e7011c6f1ef2.ff"  "ffdf1e7011ff3b5ce.ff" 
## [187] "ffdf1e7012080598c.ff"  "ffdf1e701216aec13.ff"  "ffdf1e70121c294e4.ff" 
## [190] "ffdf1e70123667b.ff"    "ffdf1e701242bc3aa.ff"  "ffdf1e701245cdace.ff" 
## [193] "ffdf1e70124ff7829.ff"  "ffdf1e7012540c78a.ff"  "ffdf1e701255f11f.ff"  
## [196] "ffdf1e70126d0c7cf.ff"  "ffdf1e70127b1bd84.ff"  "ffdf1e7012ac93ad1.ff" 
## [199] "ffdf1e7012b2455b.ff"   "ffdf1e7012b78f8e4.ff"  "ffdf1e7012d2036a1.ff" 
## [202] "ffdf1e70130e688f5.ff"  "ffdf1e7013133e1b9.ff"  "ffdf1e701321f514.ff"  
## [205] "ffdf1e70133e5f9e0.ff"  "ffdf1e70134c315e7.ff"  "ffdf1e70137a512e6.ff" 
## [208] "ffdf1e70138ea5638.ff"  "ffdf1e701391bbb2d.ff"  "ffdf1e7013923bfae.ff" 
## [211] "ffdf1e7013a1a516b.ff"  "ffdf1e7013a79f8f7.ff"  "ffdf1e7013b75b97a.ff" 
## [214] "ffdf1e7013b9685d.ff"   "ffdf1e7013cf0d95f.ff"  "ffdf1e7013d24a714.ff" 
## [217] "ffdf1e7013dcced9.ff"   "ffdf1e70141dfe87b.ff"  "ffdf1e701441f7978.ff" 
## [220] "ffdf1e7014450909c.ff"  "ffdf1e701459ec99b.ff"  "ffdf1e70145c0001a.ff" 
## [223] "ffdf1e70146939527.ff"  "ffdf1e70146b76ea5.ff"  "ffdf1e70147035c6e.ff" 
## [226] "ffdf1e701475ec4b9.ff"  "ffdf1e701489c06a5.ff"  "ffdf1e70149ec423d.ff" 
## [229] "ffdf1e7014a6bfd58.ff"  "ffdf1e7014adee59d.ff"  "ffdf1e7014cd3d712.ff" 
## [232] "ffdf1e7014cf73d8d.ff"  "ffdf1e7014d159c00.ff"  "ffdf1e7014def2f74.ff" 
## [235] "ffdf1e7014e839a34.ff"  "ffdf1e70150d40c66.ff"  "ffdf1e701524f51d4.ff" 
## [238] "ffdf1e70153e79795.ff"  "ffdf1e701547987a6.ff"  "ffdf1e70157dde213.ff" 
## [241] "ffdf1e7015b4a2185.ff"  "ffdf1e7015c57a1a9.ff"  "ffdf1e7015f4e3692.ff" 
## [244] "ffdf1e70160b3e23c.ff"  "ffdf1e7016123f92.ff"   "ffdf1e70166d10680.ff" 
## [247] "ffdf1e701670e9b46.ff"  "ffdf1e70167d4beb6.ff"  "ffdf1e701684bd07d.ff" 
## [250] "ffdf1e701692f8ef.ff"   "ffdf1e70169cb0e10.ff"  "ffdf1e7016b231d89.ff" 
## [253] "ffdf1e7017025d87a.ff"  "ffdf1e70172061165.ff"  "ffdf1e701720dc7a.ff"  
## [256] "ffdf1e70173bbda5c.ff"  "ffdf1e701750bc1b6.ff"  "ffdf1e70175cd4641.ff" 
## [259] "ffdf1e701799b79a.ff"   "ffdf1e7017b68a809.ff"  "ffdf1e7017d502e6e.ff" 
## [262] "ffdf1e7017e3e246d.ff"  "ffdf1e7017f9038b8.ff"  "ffdf1e7018551842.ff"  
## [265] "ffdf1e7018700a14.ff"   "ffdf1e7018c39a1a.ff"   "ffdf1e7019003a51.ff"  
## [268] "ffdf1e7019b4ee03.ff"   "ffdf1e701b460723.ff"   "ffdf1e701e16335e.ff"  
## [271] "ffdf1e701f9706f9.ff"   "ffdf1e701ffd8e01.ff"   "ffdf2aea211a9d2f3.ff" 
## [274] "ffdf2aea2125f9beb.ff"  "ffdf2aea212c50ad8.ff"  "ffdf2aea21373d31.ff"  
## [277] "ffdf2aea21396d91c.ff"  "ffdf2aea215449e20.ff"  "ffdf2aea2159afc27.ff" 
## [280] "ffdf2aea2197d5dd6.ff"  "ffdf2aea21a497ec5.ff"  "ffdf2aea21b4925e6.ff" 
## [283] "ffdf2aea21bacebae.ff"  "ffdf2aea21be5d36.ff"   "ffdf2aea21c6d8ece.ff" 
## [286] "ffdf2aea21dd0b799.ff"  "ffdf2aea22005aa3.ff"   "ffdf2aea2212708e2.ff" 
## [289] "ffdf2aea221413bc8.ff"  "ffdf2aea221d20435.ff"  "ffdf2aea22271ba5f.ff" 
## [292] "ffdf2aea22444718e.ff"  "ffdf2aea227aebdf1.ff"  "ffdf2aea227c003a0.ff" 
## [295] "ffdf2aea2289f4e92.ff"  "ffdf2aea229991fa4.ff"  "ffdf2aea22df3ec18.ff" 
## [298] "ffdf2aea2304f7e9d.ff"  "ffdf2aea230e4220d.ff"  "ffdf2aea233f9fcd5.ff" 
## [301] "ffdf2aea23458da4b.ff"  "ffdf2aea2363d3988.ff"  "ffdf2aea2364b497f.ff" 
## [304] "ffdf2aea237557cde.ff"  "ffdf2aea237d9f5d.ff"   "ffdf2aea238310f44.ff" 
## [307] "ffdf2aea2393123c2.ff"  "ffdf2aea23b7087a8.ff"  "ffdf2aea23d70e24b.ff" 
## [310] "ffdf2aea23ecbaf8e.ff"  "ffdf2aea23edf492e.ff"  "ffdf2aea23fae902.ff"  
## [313] "ffdf2aea241d9c01.ff"   "ffdf2aea244314c61.ff"  "ffdf2aea244597972.ff" 
## [316] "ffdf2aea24670062c.ff"  "ffdf2aea24733841d.ff"  "ffdf2aea24790d5f2.ff" 
## [319] "ffdf2aea24b0b1425.ff"  "ffdf2aea24b1cf07a.ff"  "ffdf2aea24bae71f3.ff" 
## [322] "ffdf2aea24f1ab53e.ff"  "ffdf2aea250d919c6.ff"  "ffdf2aea2529776fd.ff" 
## [325] "ffdf2aea252aa7d3.ff"   "ffdf2aea252ae8199.ff"  "ffdf2aea254f9c0ef.ff" 
## [328] "ffdf2aea25523e464.ff"  "ffdf2aea2553ce582.ff"  "ffdf2aea256a96a9b.ff" 
## [331] "ffdf2aea257dc3698.ff"  "ffdf2aea25cf2cba3.ff"  "ffdf2aea25da0b7d4.ff" 
## [334] "ffdf2aea25dc25075.ff"  "ffdf2aea2600ceb56.ff"  "ffdf2aea26199bed3.ff" 
## [337] "ffdf2aea266305546.ff"  "ffdf2aea266cd72ff.ff"  "ffdf2aea268cf74c3.ff" 
## [340] "ffdf2aea26ceef4af.ff"  "ffdf2aea26dff0410.ff"  "ffdf2aea26f4f85b3.ff" 
## [343] "ffdf2aea27042e6d5.ff"  "ffdf2aea27086c2e4.ff"  "ffdf2aea273892246.ff" 
## [346] "ffdf2aea27516f90d.ff"  "ffdf2aea27818cedf.ff"  "ffdf2aea27c3096a1.ff" 
## [349] "ffdf2aea2bfa0b8b.ff"   "ffdf2aea2e7c30f0.ff"   "ffdf2d4961013983d.ff" 
## [352] "ffdf2d496106d71ed.ff"  "ffdf2d49610dcc011.ff"  "ffdf2d49611582295.ff" 
## [355] "ffdf2d49612fd0385.ff"  "ffdf2d49613d4f6ed.ff"  "ffdf2d49613d96f9e.ff" 
## [358] "ffdf2d496169a9f59.ff"  "ffdf2d49616c18900.ff"  "ffdf2d49616c34633.ff" 
## [361] "ffdf2d496183560e5.ff"  "ffdf2d49619378a62.ff"  "ffdf2d4961a902ea8.ff" 
## [364] "ffdf2d4961c9263f0.ff"  "ffdf2d4961ca2ccc2.ff"  "ffdf2d4961d62507b.ff" 
## [367] "ffdf2d4961e2276d7.ff"  "ffdf2d4961f34f0f1.ff"  "ffdf2d496203a0c22.ff" 
## [370] "ffdf2d49620967a59.ff"  "ffdf2d49620b5ab8d.ff"  "ffdf2d49621c4e92e.ff" 
## [373] "ffdf2d4962278efad.ff"  "ffdf2d49623c14951.ff"  "ffdf2d4962761eb91.ff" 
## [376] "ffdf2d49627f5aeee.ff"  "ffdf2d496283b8a90.ff"  "ffdf2d4962a0dcff6.ff" 
## [379] "ffdf2d4962d54e8ec.ff"  "ffdf2d4962db10d9.ff"   "ffdf2d4962db30dc9.ff" 
## [382] "ffdf2d4962edfebd1.ff"  "ffdf2d4962ef6e9e5.ff"  "ffdf2d4962fe11818.ff" 
## [385] "ffdf2d496321781a7.ff"  "ffdf2d49636444aec.ff"  "ffdf2d4963657f6ef.ff" 
## [388] "ffdf2d4963841b710.ff"  "ffdf2d49638697848.ff"  "ffdf2d49638ef0169.ff" 
## [391] "ffdf2d4963b84c621.ff"  "ffdf2d4963cab4dbb.ff"  "ffdf2d4963d316f51.ff" 
## [394] "ffdf2d4963d352298.ff"  "ffdf2d4963f2376c9.ff"  "ffdf2d496406db597.ff" 
## [397] "ffdf2d496408484c8.ff"  "ffdf2d496412d3950.ff"  "ffdf2d496414f5bd2.ff" 
## [400] "ffdf2d4964267fd5d.ff"  "ffdf2d49642f80dd.ff"   "ffdf2d496434537a9.ff" 
## [403] "ffdf2d496479bf7b3.ff"  "ffdf2d4964954fa74.ff"  "ffdf2d4964a1ead2.ff"  
## [406] "ffdf2d4964c0cbd8d.ff"  "ffdf2d4964c7f709.ff"   "ffdf2d4964ea6637a.ff" 
## [409] "ffdf2d4964f8d643f.ff"  "ffdf2d4964f98cd4a.ff"  "ffdf2d49653c84e28.ff" 
## [412] "ffdf2d49654476e8d.ff"  "ffdf2d49655f252b9.ff"  "ffdf2d49656a8e37e.ff" 
## [415] "ffdf2d496576bb953.ff"  "ffdf2d4965770d8d7.ff"  "ffdf2d496577e7084.ff" 
## [418] "ffdf2d4965923f538.ff"  "ffdf2d4965b15d937.ff"  "ffdf2d4965c666042.ff" 
## [421] "ffdf2d4965d108259.ff"  "ffdf2d4965d77957b.ff"  "ffdf2d49661628a1c.ff" 
## [424] "ffdf2d4966271d843.ff"  "ffdf2d49663851f6d.ff"  "ffdf2d49664cbe6c5.ff" 
## [427] "ffdf2d496654102d3.ff"  "ffdf2d4966608f1e5.ff"  "ffdf2d4966756073.ff"  
## [430] "ffdf2d496681c13ac.ff"  "ffdf2d49669105fe2.ff"  "ffdf2d4966a82531c.ff" 
## [433] "ffdf2d4966e81d93e.ff"  "ffdf2d4966f3bba34.ff"  "ffdf2d4966f7ce981.ff" 
## [436] "ffdf2d4967323c411.ff"  "ffdf2d49674325791.ff"  "ffdf2d4967449a4e4.ff" 
## [439] "ffdf2d496758840d9.ff"  "ffdf2d49679b3795a.ff"  "ffdf2d4967e0556de.ff" 
## [442] "ffdf2d4967f407c47.ff"  "ffdf2d496b894f8f.ff"   "ffdf2d496c7ff2bf.ff"  
## [445] "ffdf2d496d1db0f.ff"    "ffdf2d496d97ad.ff"     "ffdf2d496fcc9a35.ff"  
## [448] "ffdf30811108aca03.ff"  "ffdf3081110a2897a.ff"  "ffdf30811132d8914.ff" 
## [451] "ffdf3081113cc82fa.ff"  "ffdf3081115ee8d85.ff"  "ffdf30811160d41ca.ff" 
## [454] "ffdf30811186296fd.ff"  "ffdf308111a495892.ff"  "ffdf308111adc7fb.ff"  
## [457] "ffdf308111bbe474.ff"   "ffdf308111bfe70d2.ff"  "ffdf308111c40313.ff"  
## [460] "ffdf308111e0ac89e.ff"  "ffdf308111f654ff4.ff"  "ffdf30811203f7ca6.ff" 
## [463] "ffdf308112088d10f.ff"  "ffdf30811208e1bbb.ff"  "ffdf3081120bfd9a3.ff" 
## [466] "ffdf30811212d537d.ff"  "ffdf3081121c035c5.ff"  "ffdf3081122b530ec.ff" 
## [469] "ffdf3081122ff9dba.ff"  "ffdf3081124a262c6.ff"  "ffdf30811260536ef.ff" 
## [472] "ffdf30811270c4434.ff"  "ffdf30811283531d.ff"   "ffdf308112a51272d.ff" 
## [475] "ffdf308112ab5913c.ff"  "ffdf308112b71239b.ff"  "ffdf308112e01d29d.ff" 
## [478] "ffdf308112f3c6ac8.ff"  "ffdf308113036acd8.ff"  "ffdf3081135e2ba00.ff" 
## [481] "ffdf3081135f99c0e.ff"  "ffdf30811372c1d8d.ff"  "ffdf308113764ec6e.ff" 
## [484] "ffdf308113770abf6.ff"  "ffdf3081138e8b18e.ff"  "ffdf308113a3663.ff"   
## [487] "ffdf308113a7607bb.ff"  "ffdf308113ac1f056.ff"  "ffdf308113c06487.ff"  
## [490] "ffdf308113c7a3610.ff"  "ffdf308113e24ad23.ff"  "ffdf3081142901c05.ff" 
## [493] "ffdf3081142a88c8.ff"   "ffdf308114335eae6.ff"  "ffdf308114407b2ba.ff" 
## [496] "ffdf3081146f8991c.ff"  "ffdf308114a668070.ff"  "ffdf308114cec2bd5.ff" 
## [499] "ffdf3081150336648.ff"  "ffdf3081150b70389.ff"  "ffdf3081153203ca4.ff" 
## [502] "ffdf308115351cafa.ff"  "ffdf308115646e279.ff"  "ffdf3081157122f82.ff" 
## [505] "ffdf3081157819b0c.ff"  "ffdf3081157fc67e.ff"   "ffdf308115a6d4a42.ff" 
## [508] "ffdf308115b71b912.ff"  "ffdf308115d05f56.ff"   "ffdf308115d51fa62.ff" 
## [511] "ffdf308115fe5333c.ff"  "ffdf30811603aab49.ff"  "ffdf3081160db552c.ff" 
## [514] "ffdf30811649083c9.ff"  "ffdf3081167ad915e.ff"  "ffdf3081168393266.ff" 
## [517] "ffdf308116960c46a.ff"  "ffdf30811699bd8a.ff"   "ffdf308116a5b4957.ff" 
## [520] "ffdf308116c091152.ff"  "ffdf308116cdaf1ff.ff"  "ffdf308116eb2f73.ff"  
## [523] "ffdf308116fff319b.ff"  "ffdf30811731eac78.ff"  "ffdf30811735f5a1a.ff" 
## [526] "ffdf30811736c97be.ff"  "ffdf3081173a6ce21.ff"  "ffdf30811756d6ceb.ff" 
## [529] "ffdf30811757f260f.ff"  "ffdf308117649bf11.ff"  "ffdf3081177a2532b.ff" 
## [532] "ffdf308117807183f.ff"  "ffdf308117879cdd8.ff"  "ffdf308117a32190.ff"  
## [535] "ffdf308117ad11d55.ff"  "ffdf308117cb8c58d.ff"  "ffdf308117dbf4130.ff" 
## [538] "ffdf308117f75ea93.ff"  "ffdf308117f828399.ff"  "ffdf308119f10c79.ff"  
## [541] "ffdf30811d1a6ea5.ff"   "ffdf30811dc94717.ff"   "ffdf30811f5dcaec.ff"  
## [544] "ffdf30811fa84ed.ff"    "ffdf32d6fb1077e9a7.ff" "ffdf32d6fb12668a53.ff"
## [547] "ffdf32d6fb1c099ce6.ff" "ffdf32d6fb232e04c5.ff" "ffdf32d6fb38961624.ff"
## [550] "ffdf32d6fb3b3fea59.ff" "ffdf32d6fb3d058d34.ff" "ffdf32d6fb3e37a088.ff"
## [553] "ffdf32d6fb3f12402f.ff" "ffdf32d6fb4f98984.ff"  "ffdf32d6fb4f9a19c3.ff"
## [556] "ffdf32d6fb50076313.ff" "ffdf32d6fb52c743ca.ff" "ffdf32d6fb52fd9b5c.ff"
## [559] "ffdf32d6fb55647d4f.ff" "ffdf32d6fb591ce210.ff" "ffdf32d6fb5f23ddde.ff"
## [562] "ffdf32d6fb627e13fa.ff" "ffdf32d6fb675729e0.ff" "ffdf399cd1366b246.ff" 
## [565] "ffdf399cd14418c29.ff"  "ffdf399cd18c9d73f.ff"  "ffdf399cd18f6fcc3.ff" 
## [568] "ffdf399cd1a33942.ff"   "ffdf399cd1af253b3.ff"  "ffdf399cd1b2b9eb4.ff" 
## [571] "ffdf399cd1badd7a.ff"   "ffdf399cd1c4aa49d.ff"  "ffdf399cd1c59c49b.ff" 
## [574] "ffdf399cd20320730.ff"  "ffdf399cd204c5f1f.ff"  "ffdf399cd2131e5d9.ff" 
## [577] "ffdf399cd23c7da78.ff"  "ffdf399cd251ecfd4.ff"  "ffdf399cd27172371.ff" 
## [580] "ffdf399cd27a83e6f.ff"  "ffdf399cd28a54d72.ff"  "ffdf399cd29091ee3.ff" 
## [583] "ffdf399cd29481fdf.ff"  "ffdf399cd294b77b1.ff"  "ffdf399cd2a039f63.ff" 
## [586] "ffdf399cd2a4e582a.ff"  "ffdf399cd2be2be8c.ff"  "ffdf399cd2cf7a043.ff" 
## [589] "ffdf399cd2de374dd.ff"  "ffdf399cd2e40c7eb.ff"  "ffdf399cd2eb438f5.ff" 
## [592] "ffdf399cd2f15fcc1.ff"  "ffdf399cd2f37ff6a.ff"  "ffdf399cd31044811.ff" 
## [595] "ffdf399cd314684e2.ff"  "ffdf399cd367a904b.ff"  "ffdf399cd37d6535a.ff" 
## [598] "ffdf399cd39455ff6.ff"  "ffdf399cd39a4a3eb.ff"  "ffdf399cd3b904810.ff" 
## [601] "ffdf399cd3cd6b830.ff"  "ffdf399cd3eec2014.ff"  "ffdf399cd3f07bdd3.ff" 
## [604] "ffdf399cd3f501710.ff"  "ffdf399cd40beeebc.ff"  "ffdf399cd42d4a82b.ff" 
## [607] "ffdf399cd44198b9c.ff"  "ffdf399cd444e2b61.ff"  "ffdf399cd447e5438.ff" 
## [610] "ffdf399cd46a34118.ff"  "ffdf399cd46bf5311.ff"  "ffdf399cd4c11e588.ff" 
## [613] "ffdf399cd4cf5144c.ff"  "ffdf399cd4da998c5.ff"  "ffdf399cd4e8d23d0.ff" 
## [616] "ffdf399cd4f36c4a3.ff"  "ffdf399cd4f91aef8.ff"  "ffdf399cd5052cfa1.ff" 
## [619] "ffdf399cd516ce3fe.ff"  "ffdf399cd543017f6.ff"  "ffdf399cd5887665b.ff" 
## [622] "ffdf399cd589f7fd5.ff"  "ffdf399cd5bc24f40.ff"  "ffdf399cd5c32d383.ff" 
## [625] "ffdf399cd5c9a4e6b.ff"  "ffdf399cd5e3e2525.ff"  "ffdf399cd5e635182.ff" 
## [628] "ffdf399cd62a4dec7.ff"  "ffdf399cd6307479.ff"   "ffdf399cd63b04978.ff" 
## [631] "ffdf399cd63f6fc.ff"    "ffdf399cd679a3c02.ff"  "ffdf399cd67aaf6df.ff" 
## [634] "ffdf399cd68531ea.ff"   "ffdf399cd6b17801b.ff"  "ffdf399cd6de42291.ff" 
## [637] "ffdf399cd6eeb06eb.ff"  "ffdf399cd70fa07c6.ff"  "ffdf399cd71e74984.ff" 
## [640] "ffdf399cd735dfbd2.ff"  "ffdf399cd7487dca4.ff"  "ffdf399cd74c0bc36.ff" 
## [643] "ffdf399cd76c6823f.ff"  "ffdf399cd772a793b.ff"  "ffdf399cd7759d0f8.ff" 
## [646] "ffdf399cd77ebbac1.ff"  "ffdf399cd787d7820.ff"  "ffdf399cd7969c3d9.ff" 
## [649] "ffdf399cd7ae083b2.ff"  "ffdf399cd7b1e18cb.ff"  "ffdf399cd7dbe4fa9.ff" 
## [652] "ffdf399cd7e33b759.ff"  "ffdf399cd94bda23.ff"   "ffdf399cdbe12a52.ff"  
## [655] "ffdf399cdc27d311.ff"   "ffdf399cdc5848a5.ff"   "ffdf399cdda56e6.ff"   
## [658] "ffdf399cde227f3b.ff"   "ffdf399cdf3105e8.ff"   "ffdf399cdf8c509f.ff"  
## [661] "ffdf3c3ef1075cd1a.ff"  "ffdf3c3ef136180ac.ff"  "ffdf3c3ef1588a8c8.ff" 
## [664] "ffdf3c3ef16f898bc.ff"  "ffdf3c3ef18449e8.ff"   "ffdf3c3ef188c5f39.ff" 
## [667] "ffdf3c3ef1e0f0c60.ff"  "ffdf3c3ef1f79e2ca.ff"  "ffdf3c3ef208e0d.ff"   
## [670] "ffdf3c3ef22a0b6b7.ff"  "ffdf3c3ef26980416.ff"  "ffdf3c3ef269bb5de.ff" 
## [673] "ffdf3c3ef26e416f1.ff"  "ffdf3c3ef28760c8.ff"   "ffdf3c3ef2b1522c.ff"  
## [676] "ffdf3c3ef2b5525b2.ff"  "ffdf3c3ef2bcc30e4.ff"  "ffdf3c3ef2c3d18ea.ff" 
## [679] "ffdf3c3ef2cd38bf8.ff"  "ffdf3c3ef2d10ad28.ff"  "ffdf3c3ef2d635931.ff" 
## [682] "ffdf3c3ef2dd4c753.ff"  "ffdf3c3ef2f15d6b0.ff"  "ffdf3c3ef30bea03a.ff" 
## [685] "ffdf3c3ef330688e1.ff"  "ffdf3c3ef33aba7a4.ff"  "ffdf3c3ef3550edb1.ff" 
## [688] "ffdf3c3ef3596f210.ff"  "ffdf3c3ef35cdc52.ff"   "ffdf3c3ef36c9b940.ff" 
## [691] "ffdf3c3ef37e172d6.ff"  "ffdf3c3ef3b7b2f2b.ff"  "ffdf3c3ef3ccb5ab4.ff" 
## [694] "ffdf3c3ef3d1249c2.ff"  "ffdf3c3ef3d7d4b12.ff"  "ffdf3c3ef3e299f31.ff" 
## [697] "ffdf3c3ef3eac0e2e.ff"  "ffdf3c3ef3f6c3fb1.ff"  "ffdf3c3ef4012f7a3.ff" 
## [700] "ffdf3c3ef4306d392.ff"  "ffdf3c3ef44935bc4.ff"  "ffdf3c3ef46091abd.ff" 
## [703] "ffdf3c3ef4963a438.ff"  "ffdf3c3ef4a38a6fd.ff"  "ffdf3c3ef4e8fb035.ff" 
## [706] "ffdf3c3ef4eedf47a.ff"  "ffdf3c3ef52240272.ff"  "ffdf3c3ef53d5d916.ff" 
## [709] "ffdf3c3ef5476ca9c.ff"  "ffdf3c3ef54cd71b3.ff"  "ffdf3c3ef54f4e879.ff" 
## [712] "ffdf3c3ef55a991b8.ff"  "ffdf3c3ef562ce991.ff"  "ffdf3c3ef5656c9d2.ff" 
## [715] "ffdf3c3ef56d7486b.ff"  "ffdf3c3ef573fc17f.ff"  "ffdf3c3ef57fed162.ff" 
## [718] "ffdf3c3ef581f5f6f.ff"  "ffdf3c3ef5832407e.ff"  "ffdf3c3ef595ebf51.ff" 
## [721] "ffdf3c3ef5a0b1371.ff"  "ffdf3c3ef5c9e09e1.ff"  "ffdf3c3ef5cfaa16.ff"  
## [724] "ffdf3c3ef5de9bede.ff"  "ffdf3c3ef5e43d1a.ff"   "ffdf3c3ef5ed9fd97.ff" 
## [727] "ffdf3c3ef5f99e325.ff"  "ffdf3c3ef605c3cf4.ff"  "ffdf3c3ef61fc478d.ff" 
## [730] "ffdf3c3ef641e8bc0.ff"  "ffdf3c3ef6478658b.ff"  "ffdf3c3ef649510d4.ff" 
## [733] "ffdf3c3ef64a9a7ad.ff"  "ffdf3c3ef67094788.ff"  "ffdf3c3ef672b3fb4.ff" 
## [736] "ffdf3c3ef67cdbefe.ff"  "ffdf3c3ef697af01.ff"   "ffdf3c3ef6f2d07e7.ff" 
## [739] "ffdf3c3ef6fc296a4.ff"  "ffdf3c3ef71471665.ff"  "ffdf3c3ef71477fde.ff" 
## [742] "ffdf3c3ef743bf5f1.ff"  "ffdf3c3ef7585f891.ff"  "ffdf3c3ef76768fce.ff" 
## [745] "ffdf3c3ef780b285a.ff"  "ffdf3c3ef7b5f20f.ff"   "ffdf3c3ef7ba6ea43.ff" 
## [748] "ffdf3c3ef7c8a8d30.ff"  "ffdf3c3ef7cae1244.ff"  "ffdf3c3ef7fb79eed.ff" 
## [751] "ffdf3c3ef7fd61fbf.ff"  "ffdf3c3ef96cf994.ff"   "ffdf3c3ef9a9fe3f.ff"  
## [754] "ffdf3c3ef9b8c347.ff"   "ffdf3c3efa12dfbd.ff"   "ffdf3c3efc1541f2.ff"  
## [757] "ffdf3c3efd91d696.ff"   "ffdf3f8e91124f47a.ff"  "ffdf3f8e911bec726.ff" 
## [760] "ffdf3f8e91326081d.ff"  "ffdf3f8e91372f458.ff"  "ffdf3f8e9154e1d3f.ff" 
## [763] "ffdf3f8e9170e6179.ff"  "ffdf3f8e917ed3575.ff"  "ffdf3f8e917fe3429.ff" 
## [766] "ffdf3f8e91c57d250.ff"  "ffdf3f8e91cb01fad.ff"  "ffdf3f8e91d3394d6.ff" 
## [769] "ffdf3f8e91db789b5.ff"  "ffdf3f8e91e23612b.ff"  "ffdf3f8e92150e277.ff" 
## [772] "ffdf3f8e921946eee.ff"  "ffdf3f8e92340d568.ff"  "ffdf3f8e92a34c671.ff" 
## [775] "ffdf3f8e92b9cf6cc.ff"  "ffdf3f8e92baaa033.ff"  "ffdf3f8e92d3bf2ef.ff" 
## [778] "ffdf3f8e930599cf3.ff"  "ffdf3f8e93221ca78.ff"  "ffdf3f8e9327fc53.ff"  
## [781] "ffdf3f8e934ba770b.ff"  "ffdf3f8e934dd3908.ff"  "ffdf3f8e9362de3de.ff" 
## [784] "ffdf3f8e9377a6f5d.ff"  "ffdf3f8e93c220b08.ff"  "ffdf3f8e93c4a540c.ff" 
## [787] "ffdf3f8e93e2e801b.ff"  "ffdf3f8e93e3a35ff.ff"  "ffdf3f8e93f49cc9d.ff" 
## [790] "ffdf3f8e9408834e4.ff"  "ffdf3f8e940a59e75.ff"  "ffdf3f8e9417608b1.ff" 
## [793] "ffdf3f8e942a61556.ff"  "ffdf3f8e946c6a5fa.ff"  "ffdf3f8e947424ba1.ff" 
## [796] "ffdf3f8e947f6aece.ff"  "ffdf3f8e9493a1978.ff"  "ffdf3f8e949eb4e9c.ff" 
## [799] "ffdf3f8e94aa6af3e.ff"  "ffdf3f8e94cdc8c38.ff"  "ffdf3f8e94deeaaf7.ff" 
## [802] "ffdf3f8e94e1b1953.ff"  "ffdf3f8e9510eea5d.ff"  "ffdf3f8e9554037f6.ff" 
## [805] "ffdf3f8e955f9dfaf.ff"  "ffdf3f8e956994746.ff"  "ffdf3f8e958859d90.ff" 
## [808] "ffdf3f8e9590112c7.ff"  "ffdf3f8e95904d39c.ff"  "ffdf3f8e95a29077a.ff" 
## [811] "ffdf3f8e95a9c767b.ff"  "ffdf3f8e95af7109c.ff"  "ffdf3f8e95bcd0fac.ff" 
## [814] "ffdf3f8e95c61f4b3.ff"  "ffdf3f8e95d385492.ff"  "ffdf3f8e95d7aae0b.ff" 
## [817] "ffdf3f8e95f295865.ff"  "ffdf3f8e95f400dce.ff"  "ffdf3f8e96009a3ba.ff" 
## [820] "ffdf3f8e961957ead.ff"  "ffdf3f8e9627a6dc2.ff"  "ffdf3f8e96320f841.ff" 
## [823] "ffdf3f8e963952c34.ff"  "ffdf3f8e963fcfa89.ff"  "ffdf3f8e9651413fe.ff" 
## [826] "ffdf3f8e966903bc5.ff"  "ffdf3f8e9682c417b.ff"  "ffdf3f8e96a0c572f.ff" 
## [829] "ffdf3f8e96a8e5535.ff"  "ffdf3f8e96b0501be.ff"  "ffdf3f8e96b23970c.ff" 
## [832] "ffdf3f8e96c429542.ff"  "ffdf3f8e96f1be8b4.ff"  "ffdf3f8e9706de4af.ff" 
## [835] "ffdf3f8e9721a67c0.ff"  "ffdf3f8e9727bab50.ff"  "ffdf3f8e9729c4cbd.ff" 
## [838] "ffdf3f8e974755e32.ff"  "ffdf3f8e9756e7e67.ff"  "ffdf3f8e976cddc36.ff" 
## [841] "ffdf3f8e976d1b03.ff"   "ffdf3f8e976e90d59.ff"  "ffdf3f8e977f03c60.ff" 
## [844] "ffdf3f8e979bb44d4.ff"  "ffdf3f8e97aa3c17b.ff"  "ffdf3f8e97ae8f9ac.ff" 
## [847] "ffdf3f8e97b9420a9.ff"  "ffdf3f8e97f4ba7be.ff"  "ffdf3f8e97f7c0b20.ff" 
## [850] "ffdf3f8e98516648.ff"   "ffdf3f8e988959bb.ff"   "ffdf3f8e996cb77f.ff"  
## [853] "ffdf3f8e9c9b1b54.ff"   "ffdf3f8e9d99c2b9.ff"   "ffdf43245102f4f38.ff" 
## [856] "ffdf4324510390bf8.ff"  "ffdf432451113874a.ff"  "ffdf4324511af0935.ff" 
## [859] "ffdf43245122abda6.ff"  "ffdf43245126bdc18.ff"  "ffdf4324512bb9443.ff" 
## [862] "ffdf4324515ab1dfb.ff"  "ffdf43245181977c4.ff"  "ffdf432451c4aed0f.ff" 
## [865] "ffdf432451d298cb7.ff"  "ffdf432451d53aa2d.ff"  "ffdf432451eb62cdd.ff" 
## [868] "ffdf432451ec7dcf9.ff"  "ffdf4324520274fed.ff"  "ffdf43245230c6634.ff" 
## [871] "ffdf4324527089813.ff"  "ffdf43245281d6217.ff"  "ffdf4324529333865.ff" 
## [874] "ffdf43245297b1418.ff"  "ffdf432452b079a5b.ff"  "ffdf432452bb83d70.ff" 
## [877] "ffdf432452c12f40b.ff"  "ffdf432452cd6e3b0.ff"  "ffdf432452fdc336.ff"  
## [880] "ffdf43245301033ac.ff"  "ffdf43245307eac50.ff"  "ffdf43245316f6c00.ff" 
## [883] "ffdf4324532b4e538.ff"  "ffdf43245355c0e71.ff"  "ffdf432453578424c.ff" 
## [886] "ffdf4324535d85afc.ff"  "ffdf432453802162f.ff"  "ffdf4324538655031.ff" 
## [889] "ffdf432453b5f628a.ff"  "ffdf432453c194c1f.ff"  "ffdf432453cc1767e.ff" 
## [892] "ffdf4324543f47b0c.ff"  "ffdf4324544d42b2c.ff"  "ffdf4324544e1a1f9.ff" 
## [895] "ffdf43245455cde20.ff"  "ffdf432454563957.ff"   "ffdf4324546e78de.ff"  
## [898] "ffdf43245472aa6bd.ff"  "ffdf4324547bff5b6.ff"  "ffdf43245485de11a.ff" 
## [901] "ffdf4324548a66bed.ff"  "ffdf432454adc2dd8.ff"  "ffdf432454d25683a.ff" 
## [904] "ffdf432454eec3424.ff"  "ffdf432454f220e2b.ff"  "ffdf432454f2f17bb.ff" 
## [907] "ffdf432454fed3037.ff"  "ffdf432455004247d.ff"  "ffdf43245510a8086.ff" 
## [910] "ffdf4324552cbec7a.ff"  "ffdf43245535aad02.ff"  "ffdf43245550d3724.ff" 
## [913] "ffdf4324556bbd743.ff"  "ffdf43245589c0e68.ff"  "ffdf432455b2e56e2.ff" 
## [916] "ffdf43245602a53d2.ff"  "ffdf4324562c35422.ff"  "ffdf43245639a55ae.ff" 
## [919] "ffdf4324563e733a8.ff"  "ffdf43245652127ae.ff"  "ffdf432456553d9f6.ff" 
## [922] "ffdf4324567848677.ff"  "ffdf432456844c83b.ff"  "ffdf432456a7493ed.ff" 
## [925] "ffdf432456afea159.ff"  "ffdf432456b3eeb8d.ff"  "ffdf432456b5da61a.ff" 
## [928] "ffdf432456bcb0a05.ff"  "ffdf432456dfd8442.ff"  "ffdf432456e25cab6.ff" 
## [931] "ffdf432456e70bb6c.ff"  "ffdf432456ed300ad.ff"  "ffdf432456facb87c.ff" 
## [934] "ffdf43245716bfde0.ff"  "ffdf4324574525d9a.ff"  "ffdf4324576e9c3a9.ff" 
## [937] "ffdf43245780c79a1.ff"  "ffdf4324579a41a43.ff"  "ffdf4324579f633db.ff" 
## [940] "ffdf432457af1e944.ff"  "ffdf432457c15cf37.ff"  "ffdf432457d383803.ff" 
## [943] "ffdf432457e79123a.ff"  "ffdf432457f774d0b.ff"  "ffdf43245a96fe96.ff"  
## [946] "ffdf43245a9be3da.ff"   "ffdf43245ab21f14.ff"   "ffdf43245c0c48b6.ff"  
## [949] "ffdf43245ccd8e14.ff"   "ffdf43245e4a7ac7.ff"   "ffdf43245e523515.ff"  
## [952] "ffdf47e96110d24c.ff"   "ffdf47e9613f28f88.ff"  "ffdf47e9619b7e2eb.ff" 
## [955] "ffdf47e961acc398f.ff"  "ffdf47e961b581683.ff"  "ffdf47e961c3dc2f5.ff" 
## [958] "ffdf47e961df85207.ff"  "ffdf47e9620176d0f.ff"  "ffdf47e96234889e2.ff" 
## [961] "ffdf47e9625becbff.ff"  "ffdf47e9625f0c89e.ff"  "ffdf47e962adaa426.ff" 
## [964] "ffdf47e962b597ff0.ff"  "ffdf47e9630bc749d.ff"  "ffdf47e9635a0c330.ff" 
## [967] "ffdf47e9638519b6.ff"   "ffdf47e963a2578d3.ff"  "ffdf47e963bdcf601.ff" 
## [970] "ffdf47e963f8211b1.ff"  "ffdf47e96431c3e48.ff"  "ffdf47e9644c9e6f1.ff" 
## [973] "ffdf47e964af07d21.ff"  "ffdf47e964d8222b6.ff"  "ffdf47e964dce4597.ff" 
## [976] "ffdf47e9659c04ab9.ff"  "ffdf47e96617bc708.ff"  "ffdf47e96621fb277.ff" 
## [979] "ffdf47e96623ba1f.ff"   "ffdf47e96629d7733.ff"  "ffdf47e9663089f72.ff" 
## [982] "ffdf47e9663376f0f.ff"  "ffdf47e966a14c272.ff"  "ffdf47e9670d79370.ff" 
## [985] "ffdf47e9671b80770.ff"  "ffdf47e96771a7c78.ff"  "ffdf47e967f3e8e5f.ff" 
## [988] "ffdf47e96b9c8f15.ff"   "ffdf47e96e529095.ff"
# investigate the structure of the object created in the R environment
summary(flights)
##                Length Class     Mode
## year           336776 ff_vector list
## month          336776 ff_vector list
## day            336776 ff_vector list
## dep_time       336776 ff_vector list
## sched_dep_time 336776 ff_vector list
## dep_delay      336776 ff_vector list
## arr_time       336776 ff_vector list
## sched_arr_time 336776 ff_vector list
## arr_delay      336776 ff_vector list
## carrier        336776 ff_vector list
## flight         336776 ff_vector list
## tailnum        336776 ff_vector list
## origin         336776 ff_vector list
## dest           336776 ff_vector list
## air_time       336776 ff_vector list
## distance       336776 ff_vector list
## hour           336776 ff_vector list
## minute         336776 ff_vector list
## time_hour      336776 ff_vector list

Memory mapping with bigmemory

Preparations

# SET UP ----------------

# load packages
library(bigmemory)
library(biganalytics)
library(pryr) # provides mem_change(), used below

Memory mapping with bigmemory

Import the data as a file-backed big.matrix.

# import the data
flights <- read.big.matrix("../data/flights.csv",
                     type="integer",
                     header=TRUE,
                     backingfile="flights.bin",
                     descriptorfile="flights.desc")

Memory mapping with bigmemory

Inspect the imported data.

summary(flights)
##                          min           max          mean           NAs
## year             2013.000000   2013.000000   2013.000000      0.000000
## month               1.000000     12.000000      6.548510      0.000000
## day                 1.000000     31.000000     15.710787      0.000000
## dep_time            1.000000   2400.000000   1349.109947   8255.000000
## sched_dep_time    106.000000   2359.000000   1344.254840      0.000000
## dep_delay         -43.000000   1301.000000     12.639070   8255.000000
## arr_time            1.000000   2400.000000   1502.054999   8713.000000
## sched_arr_time      1.000000   2359.000000   1536.380220      0.000000
## arr_delay         -86.000000   1272.000000      6.895377   9430.000000
## carrier             9.000000      9.000000      9.000000 318316.000000
## flight              1.000000   8500.000000   1971.923620      0.000000
## tailnum                                                  336776.000000
## origin                                                   336776.000000
## dest                                                     336776.000000
## air_time           20.000000    695.000000    150.686460   9430.000000
## distance           17.000000   4983.000000   1039.912604      0.000000
## hour                1.000000     23.000000     13.180247      0.000000
## minute              0.000000     59.000000     26.230100      0.000000
## time_hour        2013.000000   2014.000000   2013.000261      0.000000

Memory mapping with bigmemory

Inspect the object loaded into the R environment.

flights
## An object of class "big.matrix"
## Slot "address":
## <pointer: 0x5625328cade0>

Memory mapping with bigmemory

  • backingfile: The binary cache of the imported data (holds the matrix values on disk).
  • descriptorfile: Metadata describing the imported data set (also on disk), used to re-attach the backing file later.
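
A minimal sketch of how the two files work together, using a small toy matrix rather than the flights data. The file names demo.bin and demo.desc and the temporary working directory are assumptions for illustration; filebacked.big.matrix() and attach.big.matrix() are from the bigmemory package.

```r
library(bigmemory)

# work in a temporary directory so the sketch leaves no files in the project
old_wd <- setwd(tempdir())
unlink(c("demo.bin", "demo.desc")) # remove leftovers from a previous run, if any

# create a small file-backed matrix:
# the values go to demo.bin, the metadata to demo.desc (hypothetical names)
x <- filebacked.big.matrix(nrow = 3, ncol = 2, type = "integer", init = 0,
                           backingfile = "demo.bin",
                           descriptorfile = "demo.desc")
x[1, ] <- c(1L, 2L)

# both files now exist on disk
file.exists(c("demo.bin", "demo.desc"))

# a second R object pointing to the same data, created from the descriptor only
y <- attach.big.matrix("demo.desc")
y[1, ]

setwd(old_wd)
```

Only the descriptor file name is passed to attach.big.matrix(); the descriptor tells bigmemory where the backing file is and how to interpret it.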

Memory mapping with bigmemory

Understanding the role of backingfile and descriptorfile.

First, import a large data set without a backing-file:

# import data and check time needed  
system.time(
     flights1 <- read.big.matrix("../data/flights.csv",
                                 header = TRUE,
                                 sep = ",",
                                 type = "integer")
)
##    user  system elapsed 
##   1.051   0.021   1.077
# import data and check memory used
mem_change(
     flights1 <- read.big.matrix("../data/flights.csv",
                                 header = TRUE,
                                 sep = ",",
                                 type = "integer")
)
## 528 B
flights1 
## An object of class "big.matrix"
## Slot "address":
## <pointer: 0x56253826a320>

Memory mapping with bigmemory

Understanding the role of backingfile and descriptorfile.

Second, import the same data set with a backing-file:

# import data and check time needed  
system.time(
     flights2 <- read.big.matrix("../data/flights.csv",
                                 header = TRUE,
                                 sep = ",",
                                 type = "integer",
                                 backingfile = "flights2.bin",
                                 descriptorfile = "flights2.desc"
                                 )
)
##    user  system elapsed 
##   1.079   0.023   1.122
# import data and check memory used
mem_change(
     flights2 <- read.big.matrix("../data/flights.csv",
                                 header = TRUE,
                                 sep = ",",
                                 type = "integer",
                                 backingfile = "flights2.bin",
                                 descriptorfile = "flights2.desc"
                                 )
)
## 528 B
flights2
## An object of class "big.matrix"
## Slot "address":
## <pointer: 0x5625316af690>

Memory mapping with bigmemory

Understanding the role of backingfile and descriptorfile.

Third, re-attach the same data set via its descriptor file (the CSV is not parsed again).

# remove the loaded file
rm(flights2)

# 'load' it via the backing-file
system.time(flights2 <- attach.big.matrix("flights2.desc"))
##    user  system elapsed 
##       0       0       0
flights2
## An object of class "big.matrix"
## Slot "address":
## <pointer: 0x56253a56eed0>

Cleaning and Transformation

Typical tasks (independent of data set size)

  • Normalize/standardize.
  • Code additional variables (indicators, strings to categorical, etc.).
  • Remove or add covariates.
  • Merge data sets.
  • Set data types.
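
A minimal base R sketch of these tasks on toy data. The data frames flights_toy and carriers_toy and all their values are made up for illustration; they are not the course data.

```r
# toy data (hypothetical values, for illustration only)
flights_toy <- data.frame(
  carrier   = c("AA", "UA", "AA", "DL"),
  dep_delay = c("5", "20", "-3", "12"), # stored as text on purpose
  distance  = c(500, 1200, 300, 800),
  stringsAsFactors = FALSE
)
carriers_toy <- data.frame(
  carrier = c("AA", "UA", "DL"),
  name    = c("American", "United", "Delta"),
  stringsAsFactors = FALSE
)

# set data types: text to integer
flights_toy$dep_delay <- as.integer(flights_toy$dep_delay)

# normalize/standardize: mean 0, standard deviation 1
flights_toy$distance_std <- as.numeric(scale(flights_toy$distance))

# code additional variables: indicator and string-to-categorical
flights_toy$delayed <- as.integer(flights_toy$dep_delay > 10)
flights_toy$carrier <- factor(flights_toy$carrier)

# remove a covariate
flights_toy$distance <- NULL

# merge data sets
flights_merged <- merge(flights_toy, carriers_toy, by = "carrier")
str(flights_merged)
```

Each of these steps creates new vectors in RAM, which is harmless here but becomes relevant once the data set is large.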

Typical workflow

  1. Import raw data.
  2. Clean/transform.
  3. Store for analysis.
    • Write to file.
    • Write to database.
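
The three steps can be sketched in base R as follows. The small CSV written to a temporary file stands in for a raw data file; its contents and the variable long_delay are assumptions for illustration. Writing to a database instead of a file would need an additional package and is not shown.

```r
# 1. Import raw data (a small CSV is written first, standing in for a raw file)
raw_path <- tempfile(fileext = ".csv")
writeLines(c("id,dep_delay,carrier",
             "1,5,AA",
             "2,NA,UA",
             "3,42,DL"), raw_path)
raw <- read.csv(raw_path, stringsAsFactors = FALSE)

# 2. Clean/transform: drop incomplete rows, code an indicator variable
clean <- raw[complete.cases(raw), ]
clean$long_delay <- clean$dep_delay > 30

# 3. Store for analysis: write to file
clean_path <- tempfile(fileext = ".csv")
write.csv(clean, clean_path, row.names = FALSE)
```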

Bottlenecks

  • RAM:
    • Raw data does not fit into memory.
    • Transformations enlarge RAM allocation (copying).
  • Mass Storage: Reading/Writing
  • CPU: Parsing (data types)
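
A short base R sketch touching two of these bottlenecks: the RAM footprint of different data types, and parsing with declared column types. The toy file and its column names are assumptions for illustration.

```r
# RAM: the storage type determines the footprint of a vector
n <- 1e6
size_int <- object.size(integer(n)) # integers use 4 bytes per value
size_dbl <- object.size(numeric(n)) # doubles use 8 bytes per value
print(size_int, units = "MB")
print(size_dbl, units = "MB")

# CPU/parsing: declaring column types up front spares read.csv() the type guessing
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, delay = rnorm(1000)),
          csv_path, row.names = FALSE)
typed <- read.csv(csv_path, colClasses = c("integer", "numeric"))
str(typed)
```

On a file of this size the parsing gain is negligible; it matters for files with millions of rows.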

Data Preparation with ff

Set up

The following examples are based on Walkowiak (2016), Chapter 3.

## SET UP ------------------------

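# Set the working directory to the folder containing the data and airline_id files.
# setwd("materials/code_book/B05396_Ch03_Code")
# create a directory for ff's temporary files (portable alternative to system("mkdir ffdf"))
dir.create("ffdf", showWarnings = FALSE)
options(fftempdir = "ffdf")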

# load packages
library(ff)
library(ffbase)
library(pryr)

# fix vars
FLIGHTS_DATA <- "../code_book/B05396_Ch03_Code/flights_sep_oct15.txt"
AIRLINES_DATA <- "../code_book/B05396_Ch03_Code/airline_id.csv"

Data import

# DATA IMPORT ------------------

# 1. Import flights_sep_oct15.txt and airline_id.csv from flat files.

system.time(flights.ff <- read.table.ffdf(file=FLIGHTS_DATA,
                                          sep=",",
                                          VERBOSE=TRUE,
                                          header=TRUE,
                                          next.rows=100000,
                                          colClasses=NA))
## read.table.ffdf 1..100000 (100000)  csv-read=0.492sec ffdf-write=0.068sec
## read.table.ffdf 100001..200000 (100000)  csv-read=0.53sec ffdf-write=0.049sec
## read.table.ffdf 200001..300000 (100000)  csv-read=0.527sec ffdf-write=0.054sec
## read.table.ffdf 300001..400000 (100000)  csv-read=0.549sec ffdf-write=0.049sec
## read.table.ffdf 400001..500000 (100000)  csv-read=0.528sec ffdf-write=0.055sec
## read.table.ffdf 500001..600000 (100000)  csv-read=0.514sec ffdf-write=0.048sec
## read.table.ffdf 600001..700000 (100000)  csv-read=0.54sec ffdf-write=0.049sec
## read.table.ffdf 700001..800000 (100000)  csv-read=0.514sec ffdf-write=0.045sec
## read.table.ffdf 800001..900000 (100000)  csv-read=0.529sec ffdf-write=0.053sec
## read.table.ffdf 900001..951111 (51111)  csv-read=0.27sec ffdf-write=0.042sec
##  csv-read=4.993sec  ffdf-write=0.512sec  TOTAL=5.505sec
##    user  system elapsed 
##   5.317   0.189   5.507
airlines.ff <- read.csv.ffdf(file= AIRLINES_DATA,
                             VERBOSE=TRUE,
                             header=TRUE,
                             next.rows=100000,
                             colClasses=NA)
## read.table.ffdf 1..1607 (1607)  csv-read=0.004sec ffdf-write=0.003sec
##  csv-read=0.004sec  ffdf-write=0.003sec  TOTAL=0.007sec
# check memory used
mem_used()
## 1,026,134,768 B
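
The VERBOSE output above shows read.table.ffdf() working through the file in blocks of next.rows lines. The idea of chunk-wise reading can be sketched in base R as follows. This is not how ff is implemented internally (ff writes each chunk to its disk-backed columns); the sketch only illustrates why chunk-wise reading keeps the RAM footprint small. The toy file and chunk_size are assumptions for illustration.

```r
# toy CSV standing in for a large file (hypothetical content)
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:25, value = (1:25) * 2), path, row.names = FALSE)

con <- file(path, open = "r")

# read the header line once and strip the quotes added by write.csv()
header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

chunk_size <- 10 # rows held in RAM at any one time
total <- 0
n_rows <- 0
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break # end of file reached
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  # process the chunk, keep only the aggregate
  total <- total + sum(chunk$value)
  n_rows <- n_rows + nrow(chunk)
}
close(con)

n_rows
total
```

Only one chunk of chunk_size rows is in memory at a time; the running aggregate is all that is kept between iterations.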

Comparison with read.table

##Using read.table()
system.time(flights.table <- read.table(FLIGHTS_DATA, 
                                        sep=",",
                                        header=TRUE))
##    user  system elapsed 
##   4.864   0.160   5.028
gc()
##             used   (Mb) gc trigger   (Mb)  max used   (Mb)
## Ncells   1396911   74.7    2150240  114.9   2150240  114.9
## Vcells 136560000 1041.9  213343868 1627.7 212429714 1620.8
system.time(airlines.table <- read.csv(AIRLINES_DATA,
                                       header = TRUE))
##    user  system elapsed 
##   0.002   0.000   0.002
# check memory used
mem_used()
## 1,170,730,232 B

Inspect imported files

# 2. Inspect the ffdf objects.
## For flights.ff object:
class(flights.ff)
## [1] "ffdf"
dim(flights.ff)
## [1] 951111     28
## For airlines.ff object:
class(airlines.ff)
## [1] "ffdf"
dim(airlines.ff)
## [1] 1607    2

Data cleaning and transformation

Goal: merge the airline data with the flights data

# step 1: 
## Rename "Code" variable from airlines.ff to "AIRLINE_ID" and "Description" into "AIRLINE_NM".
names(airlines.ff) <- c("AIRLINE_ID", "AIRLINE_NM")
names(airlines.ff)
## [1] "AIRLINE_ID" "AIRLINE_NM"
str(airlines.ff[1:20,])
## 'data.frame':    20 obs. of  2 variables:
##  $ AIRLINE_ID: int  19031 19032 19033 19034 19035 19036 19037 19038 19039 19040 ...
##  $ AIRLINE_NM: Factor w/ 1607 levels "40-Mile Air: Q5",..: 945 1025 503 721 64 725 1194 99 1395 276 ...

Data cleaning and transformation

Goal: merge the airline data with the flights data

# merge of ffdf objects
mem_change(flights.data.ff <- merge.ffdf(flights.ff, airlines.ff, by="AIRLINE_ID"))
## 780 kB
class(flights.data.ff)
## [1] "ffdf"
dim(flights.data.ff)
## [1] 951111     29
dimnames(flights.data.ff)
## [[1]]
## NULL
## 
## [[2]]
##  [1] "YEAR"              "MONTH"             "DAY_OF_MONTH"      "DAY_OF_WEEK"      
##  [5] "FL_DATE"           "UNIQUE_CARRIER"    "AIRLINE_ID"        "TAIL_NUM"         
##  [9] "FL_NUM"            "ORIGIN_AIRPORT_ID" "ORIGIN"            "ORIGIN_CITY_NAME" 
## [13] "ORIGIN_STATE_NM"   "ORIGIN_WAC"        "DEST_AIRPORT_ID"   "DEST"             
## [17] "DEST_CITY_NAME"    "DEST_STATE_NM"     "DEST_WAC"          "DEP_TIME"         
## [21] "DEP_DELAY"         "ARR_TIME"          "ARR_DELAY"         "CANCELLED"        
## [25] "CANCELLATION_CODE" "DIVERTED"          "AIR_TIME"          "DISTANCE"         
## [29] "AIRLINE_NM"

Compare with the in-memory operation

## For airlines.table:
names(airlines.table) <- c("AIRLINE_ID", "AIRLINE_NM")
names(airlines.table)
## [1] "AIRLINE_ID" "AIRLINE_NM"
str(airlines.table[1:20,])
## 'data.frame':    20 obs. of  2 variables:
##  $ AIRLINE_ID: int  19031 19032 19033 19034 19035 19036 19037 19038 19039 19040 ...
##  $ AIRLINE_NM: chr  "Mackey International Inc.: MAC" "Munz Northern Airlines Inc.: XY" "Cochise Airlines Inc.: COC" "Golden Gate Airlines Inc.: GSA" ...
# check memory usage of merge in RAM 
mem_change(flights.data.table <- merge(flights.table,
                                       airlines.table,
                                       by="AIRLINE_ID"))
## 160 MB

Subsetting

mem_used()
## 1,331,356,592 B
# Subset the ffdf object flights.data.ff:
subs1.ff <- subset.ffdf(flights.data.ff, CANCELLED == 1, 
                        select = c(FL_DATE, AIRLINE_ID, 
                                   ORIGIN_CITY_NAME,
                                   ORIGIN_STATE_NM,
                                   DEST_CITY_NAME,
                                   DEST_STATE_NM,
                                   CANCELLATION_CODE))

dim(subs1.ff)
## [1] 4529    7
mem_used()
## 1,331,639,616 B

Save to ffdf-files

(For further processing with ff)

# Save a newly created ffdf object to a data file:

save.ffdf(subs1.ff, overwrite = TRUE) # creates 7 files (one per column) in the ffdb directory

Load ffdf-files

# Loading previously saved ffdf files:
rm(subs1.ff)
gc()
##             used   (Mb) gc trigger   (Mb)  max used   (Mb)
## Ncells   1417402   75.7    4481831  239.4   3298978  176.2
## Vcells 156550295 1194.4  256092641 1953.9 212429714 1620.8
load.ffdf("ffdb")
str(subs1.ff)
## List of 3
##  $ virtual: 'data.frame':    7 obs. of  7 variables:
##  .. $ VirtualVmode     : chr  "integer" "integer" "integer" "integer" ...
##  .. $ AsIs             : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  .. $ VirtualIsMatrix  : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  .. $ PhysicalIsMatrix : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  .. $ PhysicalElementNo: int  1 2 3 4 5 6 7
##  .. $ PhysicalFirstCol : int  1 1 1 1 1 1 1
##  .. $ PhysicalLastCol  : int  1 1 1 1 1 1 1
##  .. - attr(*, "Dim")= int [1:2] 4529 7
##  .. - attr(*, "Dimorder")= int [1:2] 1 2
##  $ physical: List of 7
##  .. $ FL_DATE          : list()
##  ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr> 
##  ..  .. ..- attr(*, "vmode")= chr "integer"
##  ..  .. ..- attr(*, "maxlength")= int 4529
##  ..  .. ..- attr(*, "pattern")= chr "ffdf"
##  ..  .. ..- attr(*, "filename")= chr "/home/umatter/Dropbox/Teaching/HSG/BigData/BigData/materials/slides/ffdb/subs1.ff$FL_DATE.ff"
##  ..  .. ..- attr(*, "pagesize")= int 65536
##  ..  .. ..- attr(*, "finalizer")= chr "close"
##  ..  .. ..- attr(*, "finonexit")= logi TRUE
##  ..  .. ..- attr(*, "readonly")= logi FALSE
##  ..  .. ..- attr(*, "caching")= chr "mmnoflush"
##  ..  ..- attr(*, "virtual")= list()
##  ..  .. ..- attr(*, "Length")= int 4529
##  ..  .. ..- attr(*, "Symmetric")= logi FALSE
##  ..  .. ..- attr(*, "Levels")= chr [1:61] "2015-09-01" "2015-09-02" "2015-09-03" "2015-09-04" ...
##  ..  .. ..- attr(*, "ramclass")= chr "factor"
##  .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
##  .. $ AIRLINE_ID       : list()
##  ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr> 
##  ..  .. ..- attr(*, "vmode")= chr "integer"
##  ..  .. ..- attr(*, "maxlength")= int 4529
##  ..  .. ..- attr(*, "pattern")= chr "ffdf"
##  ..  .. ..- attr(*, "filename")= chr "/home/umatter/Dropbox/Teaching/HSG/BigData/BigData/materials/slides/ffdb/subs1.ff$AIRLINE_ID.ff"
##  ..  .. ..- attr(*, "pagesize")= int 65536
##  ..  .. ..- attr(*, "finalizer")= chr "close"
##  ..  .. ..- attr(*, "finonexit")= logi TRUE
##  ..  .. ..- attr(*, "readonly")= logi FALSE
##  ..  .. ..- attr(*, "caching")= chr "mmnoflush"
##  ..  ..- attr(*, "virtual")= list()
##  ..  .. ..- attr(*, "Length")= int 4529
##  ..  .. ..- attr(*, "Symmetric")= logi FALSE
##  .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
##  .. $ ORIGIN_CITY_NAME : list()
##  ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr> 
##  ..  .. ..- attr(*, "vmode")= chr "integer"
##  ..  .. ..- attr(*, "maxlength")= int 4529
##  ..  .. ..- attr(*, "pattern")= chr "ffdf"
##  ..  .. ..- attr(*, "filename")= chr "/home/umatter/Dropbox/Teaching/HSG/BigData/BigData/materials/slides/ffdb/subs1.ff$ORIGIN_CITY_NAME.ff"
##  ..  .. ..- attr(*, "pagesize")= int 65536
##  ..  .. ..- attr(*, "finalizer")= chr "close"
##  ..  .. ..- attr(*, "finonexit")= logi TRUE
##  ..  .. ..- attr(*, "readonly")= logi FALSE
##  ..  .. ..- attr(*, "caching")= chr "mmnoflush"
##  ..  ..- attr(*, "virtual")= list()
##  ..  .. ..- attr(*, "Length")= int 4529
##  ..  .. ..- attr(*, "Symmetric")= logi FALSE
##  ..  .. ..- attr(*, "Levels")= chr [1:305] "Abilene, TX" "Akron, OH" "Albany, GA" "Albany, NY" ...
##  ..  .. ..- attr(*, "ramclass")= chr "factor"
##  .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
##  .. $ ORIGIN_STATE_NM  : list()
##  ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr> 
##  ..  .. ..- attr(*, "vmode")= chr "integer"
##  ..  .. ..- attr(*, "maxlength")= int 4529
##  ..  .. ..- attr(*, "pattern")= chr "ffdf"
##  ..  .. ..- attr(*, "filename")= chr "/home/umatter/Dropbox/Teaching/HSG/BigData/BigData/materials/slides/ffdb/subs1.ff$ORIGIN_STATE_NM.ff"
##  ..  .. ..- attr(*, "pagesize")= int 65536
##  ..  .. ..- attr(*, "finalizer")= chr "close"
##  ..  .. ..- attr(*, "finonexit")= logi TRUE
##  ..  .. ..- attr(*, "readonly")= logi FALSE
##  ..  .. ..- attr(*, "caching")= chr "mmnoflush"
##  ..  ..- attr(*, "virtual")= list()
##  ..  .. ..- attr(*, "Length")= int 4529
##  ..  .. ..- attr(*, "Symmetric")= logi FALSE
##  ..  .. ..- attr(*, "Levels")= chr [1:52] "Alabama" "Alaska" "Arizona" "Arkansas" ...
##  ..  .. ..- attr(*, "ramclass")= chr "factor"
##  .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
##  .. $ DEST_CITY_NAME   : list()
##  ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr> 
##  ..  .. ..- attr(*, "vmode")= chr "integer"
##  ..  .. ..- attr(*, "maxlength")= int 4529
##  ..  .. ..- attr(*, "pattern")= chr "ffdf"
##  ..  .. ..- attr(*, "filename")= chr "/home/umatter/Dropbox/Teaching/HSG/BigData/BigData/materials/slides/ffdb/subs1.ff$DEST_CITY_NAME.ff"
##  ..  .. ..- attr(*, "pagesize")= int 65536
##  ..  .. ..- attr(*, "finalizer")= chr "close"
##  ..  .. ..- attr(*, "finonexit")= logi TRUE
##  ..  .. ..- attr(*, "readonly")= logi FALSE
##  ..  .. ..- attr(*, "caching")= chr "mmnoflush"
##  ..  ..- attr(*, "virtual")= list()
##  ..  .. ..- attr(*, "Length")= int 4529
##  ..  .. ..- attr(*, "Symmetric")= logi FALSE
##  ..  .. ..- attr(*, "Levels")= chr [1:306] "Abilene, TX" "Akron, OH" "Albany, GA" "Albany, NY" ...
##  ..  .. ..- attr(*, "ramclass")= chr "factor"
##  .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
##  .. $ DEST_STATE_NM    : list()
##  ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr> 
##  ..  .. ..- attr(*, "vmode")= chr "integer"
##  ..  .. ..- attr(*, "maxlength")= int 4529
##  ..  .. ..- attr(*, "pattern")= chr "ffdf"
##  ..  .. ..- attr(*, "filename")= chr "/home/umatter/Dropbox/Teaching/HSG/BigData/BigData/materials/slides/ffdb/subs1.ff$DEST_STATE_NM.ff"
##  ..  .. ..- attr(*, "pagesize")= int 65536
##  ..  .. ..- attr(*, "finalizer")= chr "close"
##  ..  .. ..- attr(*, "finonexit")= logi TRUE
##  ..  .. ..- attr(*, "readonly")= logi FALSE
##  ..  .. ..- attr(*, "caching")= chr "mmnoflush"
##  ..  ..- attr(*, "virtual")= list()
##  ..  .. ..- attr(*, "Length")= int 4529
##  ..  .. ..- attr(*, "Symmetric")= logi FALSE
##  ..  .. ..- attr(*, "Levels")= chr [1:52] "Alabama" "Alaska" "Arizona" "Arkansas" ...
##  ..  .. ..- attr(*, "ramclass")= chr "factor"
##  .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
##  .. $ CANCELLATION_CODE: list()
##  ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr> 
##  ..  .. ..- attr(*, "vmode")= chr "integer"
##  ..  .. ..- attr(*, "maxlength")= int 4529
##  ..  .. ..- attr(*, "pattern")= chr "ffdf"
##  ..  .. ..- attr(*, "filename")= chr "/home/umatter/Dropbox/Teaching/HSG/BigData/BigData/materials/slides/ffdb/subs1.ff$CANCELLATION_CODE.ff"
##  ..  .. ..- attr(*, "pagesize")= int 65536
##  ..  .. ..- attr(*, "finalizer")= chr "close"
##  ..  .. ..- attr(*, "finonexit")= logi TRUE
##  ..  .. ..- attr(*, "readonly")= logi FALSE
##  ..  .. ..- attr(*, "caching")= chr "mmnoflush"
##  ..  ..- attr(*, "virtual")= list()
##  ..  .. ..- attr(*, "Length")= int 4529
##  ..  .. ..- attr(*, "Symmetric")= logi FALSE
##  ..  .. ..- attr(*, "Levels")= chr [1:4] "" "A" "B" "C"
##  ..  .. ..- attr(*, "ramclass")= chr "factor"
##  .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
##  $ row.names:  NULL
## - attributes: List of 2
##  .. $ names: chr [1:2] "virtual" "physical"
##  .. $ class: chr "ffdf"
dim(subs1.ff)
## [1] 4529    7
dimnames(subs1.ff)
## [[1]]
## NULL
## 
## [[2]]
## [1] "FL_DATE"           "AIRLINE_ID"        "ORIGIN_CITY_NAME"  "ORIGIN_STATE_NM"  
## [5] "DEST_CITY_NAME"    "DEST_STATE_NM"     "CANCELLATION_CODE"

Export to CSV

# Export subs1.ff to a CSV file:
write.csv.ffdf(subs1.ff, "subset1.csv")

References

Walkowiak, Simon. 2016. Big Data Analytics with R. Birmingham, UK: Packt Publishing.